Facets (ggplot2)

Problem

You want to split up your data by one or more variables and plot the subsets of data together.

Solution

Sample data

We will use the tips dataset from the reshape2 package.

library(ggplot2)
library(reshape2)  # Provides the tips dataset
tips
# total_bill   tip    sex smoker  day   time size
#      16.99  1.01 Female     No  Sun Dinner    2
#      10.34  1.66   Male     No  Sun Dinner    3
#      21.01  3.50   Male     No  Sun Dinner    3
#        ...
#      22.67  2.00   Male    Yes  Sat Dinner    2
#      17.82  1.75   Male     No  Sat Dinner    2
#      18.78  3.00 Female     No Thur Dinner    2

This is a scatterplot of the tip, as a fraction of the total bill, against the total bill.

sp <- ggplot(tips, aes(x=total_bill, y=tip/total_bill)) + geom_point(shape=1)
sp

facet_grid

The data can be split up by one or two variables, with the levels of each variable laid out in the horizontal and/or vertical direction.

This is done by giving a formula to facet_grid(), of the form vertical ~ horizontal.

# Divide by levels of "sex", in the vertical direction
sp + facet_grid(sex ~ .)

# Divide by levels of "sex", in the horizontal direction
sp + facet_grid(. ~ sex)

# Divide with "sex" vertical, "day" horizontal
sp + facet_grid(sex ~ day)
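
Since ggplot2 3.0.0, the same layouts can also be written without a formula, by passing variables wrapped in vars() to the rows and cols arguments. The following is a minimal sketch, assuming the tips data is loaded from the reshape2 package; the name p_grid is only illustrative.

```r
library(ggplot2)
library(reshape2)  # assumed source of the tips dataset

sp <- ggplot(tips, aes(x=total_bill, y=tip/total_bill)) + geom_point(shape=1)

# Equivalent to facet_grid(sex ~ day): "sex" in rows, "day" in columns
p_grid <- sp + facet_grid(rows = vars(sex), cols = vars(day))
p_grid
```

Omitting either rows or cols gives a single column or a single row of panels, like sex ~ . and . ~ sex above.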

facet_wrap

Instead of faceting with a variable in the horizontal or vertical direction, facets can be placed next to each other, wrapping with a certain number of columns or rows. The label for each plot will be at the top of the plot.

# Divide by day, going horizontally and wrapping with 2 columns
sp + facet_wrap( ~ day, ncol=2)
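
The wrapping can also be controlled with nrow instead of ncol, and more than one variable can be given in the formula, in which case each combination of levels gets its own panel. This is a short sketch, assuming the tips data is loaded from reshape2; p_row and p_combo are illustrative names.

```r
library(ggplot2)
library(reshape2)  # assumed source of the tips dataset

sp <- ggplot(tips, aes(x=total_bill, y=tip/total_bill)) + geom_point(shape=1)

# Divide by day, laid out in a single row
p_row <- sp + facet_wrap( ~ day, nrow=1)
p_row

# Divide by each combination of sex and smoker, wrapping with 2 columns
p_combo <- sp + facet_wrap( ~ sex + smoker, ncol=2)
p_combo
```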

Modifying facet label appearance

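# Strip text and background are set with theme() and element_*()
# (opts(), theme_text() and theme_rect() were removed from ggplot2)
sp + facet_grid(sex ~ day) +
    theme(strip.text.x = element_text(size=8, angle=75),
          strip.text.y = element_text(size=12, face="bold"),
          strip.background = element_rect(colour="red", fill="#CCCCFF"))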

Modifying facet label text

Modifying the text of a facet label is somewhat complex. It is necessary to either (A) create a function which maps the original names to the new names, or (B) change the data frame.

To create a function which maps the levels of sex from Female==>Woman, and Male==>Man:

# A named vector maps the original level names to the new labels;
# as_labeller() turns it into a labeller function (ggplot2 2.0.0 and later)
mf_labeller <- as_labeller(c(Female="Woman", Male="Man"))

sp + facet_grid(. ~ sex, labeller=mf_labeller)
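
When the plot is faceted on more than one variable, ggplot2's labeller() helper (available since version 2.0.0) can relabel one variable and leave the others unchanged. The sketch below assumes the tips data from reshape2; sex_labels and p_lab are illustrative names.

```r
library(ggplot2)
library(reshape2)  # assumed source of the tips dataset

sp <- ggplot(tips, aes(x=total_bill, y=tip/total_bill)) + geom_point(shape=1)

# Named vector: original level name = new label
sex_labels <- c(Female="Woman", Male="Man")

# Only the "sex" strips are relabelled; the "day" strips keep their level names
p_lab <- sp + facet_grid(sex ~ day, labeller = labeller(sex = sex_labels))
p_lab
```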

To change the data frame:

tips2 <- tips
levels(tips2$sex)[levels(tips2$sex)=="Female"] <- "Woman"
levels(tips2$sex)[levels(tips2$sex)=="Male"]   <- "Man"
# total_bill  tip   sex smoker day   time size
#      16.99 1.01 Woman     No Sun Dinner    2
#      10.34 1.66   Man     No Sun Dinner    3
#       ...

sp2 <- ggplot(tips2, aes(x=total_bill, y=tip/total_bill)) + geom_point(shape=1)
sp2 + facet_grid(. ~ sex)

Both of these will give the same result.

Free scales

Normally, the axis scales on each graph are fixed, which means that they have the same size and range. They can be made independent by setting scales to "free", "free_x", or "free_y".

# A histogram of bill sizes
hp <- ggplot(tips, aes(x=total_bill)) + geom_histogram(binwidth=2, colour="white")

# Histogram of total_bill, divided by sex and smoker
hp + facet_grid(sex ~ smoker)

# Same as above, with scales="free_y"
hp + facet_grid(sex ~ smoker, scales="free_y")

# With panels that have the same scaling, but different range (and therefore different physical sizes)
hp + facet_grid(sex ~ smoker, scales="free", space="free")
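
The scales argument is also accepted by facet_wrap(). As a brief sketch, assuming the tips data from reshape2 and using the illustrative name p_free, each day's histogram below gets its own y range while the x axis stays shared:

```r
library(ggplot2)
library(reshape2)  # assumed source of the tips dataset

# A histogram of bill sizes
hp <- ggplot(tips, aes(x=total_bill)) + geom_histogram(binwidth=2, colour="white")

# One panel per day, 2 columns, independent y axes
p_free <- hp + facet_wrap( ~ day, ncol=2, scales="free_y")
p_free
```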